NOTE: During private preview stage, Metrics Advisor for Equipment is only available in West Europe, East Europe, and UK South. Please make sure your subscription and resources are created in one of the available regions.
Next, you're going to fill out the Create Metrics Advisor fields with the following values:
Confirm and select Review + Create.
Azure will run a quick validation check, after a few seconds you should see a green banner that says Validation Passed.
Once the validation banner appears, select the Create button from the bottom-left corner.
After you select Create, you'll be redirected to a new page that says Deployment in progress. The deployment could take up to 60 minutes to complete, although it normally finishes in less than 10 minutes.
Once the deployment is done, you'll see a message that says, Your deployment is complete. Then, select the Go to resource button.
Copy the key and endpoint values from your Metrics Advisor resource and paste them in a convenient location, such as Microsoft Notepad. You'll need the key and endpoint values to connect your application to the Metrics Advisor for Equipment API.
If your overview page doesn't have the keys and endpoint visible, you can select the Keys and Endpoint button under the Resource Management section on the left navigation bar and retrieve them there.
That's it! You're now ready to start scaling predictive maintenance using Azure Metrics Advisor for Equipment.
For example, if you'd like to add a dataset that contains preprocessed data, you can call [PUT] https://{endpoint}/datasets/{datasetName} and enter your information following the request sample below to add the dataset.
Sample Request
{
"parameters": {
"endpoint": "{endpoint}", // Your Metrics Advisor resource endpoint URL
"apiVersion": "2022-07-10-preview",
"Ocp-Apim-Subscription-Key": "{API key}", // Your Metrics Advisor resource key
"Content-Type": "application/json",
"datasetName": "prod_controlValve_5min_v1", // Unique and case-sensitive
"body": {
"datasetDescription": "{Optional field to add more details}",
"dataSourceInfo": {
"dataSourceType": "SqlServer",
"authenticationType": "ManagedIdentity",
"serverName": "{your_sql_server_name}", // Name of the Azure SQL server with your training, evaluation, and/or inference data
"databaseName": "{your_database_name}", // Your Azure SQL database name (must be in the above server)
"tableName": "{your_sql_table_name}" //Your Azure SQL table / SQL view name (must be in the above database)
},
"dataSchema": {
"dataSchemaType": "LongTable",
"timestampColumnName": "eventTimeStamp", // Replace with header of the column that contains datetime values
"variableColumnName": "sensorName", // Replace with header of the column that contains the names of your input variables
"valueColumnName": "sensorValue" // Replace with header of the column that contains numeric values
},
"dataGranularityNumber": 5,
"dataGranularityUnit": "Minutes"
}
}
}
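If you prefer to make this call from code rather than assemble the raw request yourself, here is a minimal sketch using Python's requests library. It assumes the "parameters" object above maps onto a standard HTTP PUT (the key and content type as headers, apiVersion as the api-version query parameter, and "body" as the JSON payload); values in braces are placeholders you need to fill in.

```python
# A minimal sketch of calling [PUT] /datasets/{datasetName}; placeholder
# values in braces must be replaced with your own resource details.
import requests

endpoint = "https://{endpoint}"   # your Metrics Advisor resource endpoint URL
api_key = "{API key}"             # your Metrics Advisor resource key
dataset_name = "prod_controlValve_5min_v1"

response = requests.put(
    f"{endpoint}/datasets/{dataset_name}",
    params={"api-version": "2022-07-10-preview"},
    headers={
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "application/json",
    },
    json={
        "datasetDescription": "Control valve sensor data at 5-minute granularity",
        "dataSourceInfo": {
            "dataSourceType": "SqlServer",
            "authenticationType": "ManagedIdentity",
            "serverName": "{your_sql_server_name}",
            "databaseName": "{your_database_name}",
            "tableName": "{your_sql_table_name}",
        },
        "dataSchema": {
            "dataSchemaType": "LongTable",
            "timestampColumnName": "eventTimeStamp",
            "variableColumnName": "sensorName",
            "valueColumnName": "sensorValue",
        },
        "dataGranularityNumber": 5,
        "dataGranularityUnit": "Minutes",
    },
)
response.raise_for_status()  # a 200 or 201 indicates success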
Response
You will get either a 201 or 200 response if the request was successful.
Request headers
Ocp-Apim-Subscription-Key
: Your Metrics Advisor resource key. This subscription key provides you access to this API.
Content-Type
: Media type of the body sent to the API.
URI parameter
datasetName
: Unique identifier of a dataset. Cannot be changed once a dataset has been created.
Query parameter
Required
apiVersion
: The current API version is 2022-07-10-preview.
Request body parameters
The request accepts the following data in JSON format.
Parameters | Description | Type | Pattern |
---|---|---|---|
datasetDescription | Optional. Provide more information about this dataset. | string | Character length: [0, 1024] |
dataGranularityNumber | Required. (Together with dataGranularityUnit) The frequency interval at which new records are added to your data. | int32 | |
dataGranularityUnit | Required. (Together with dataGranularityNumber) The unit of your data frequency interval. | string | Valid values: Minutes, Hours, Days, Weeks, Months, Years. |
dataSourceInfo | |||
dataSourceType | Required. Type of your data source. | string | Valid value: SqlServer |
authenticationType | Required. Method to authenticate Metrics Advisor for Equipment to access your data source. | string | Valid value: ManagedIdentity |
serverName | Required if dataSourceType = SqlServer. Name of a SQL Server. | string | Case-sensitive: No |
databaseName | Required if dataSourceType = SqlServer. Name of a SQL Database. | string | Case-sensitive: Yes |
tableName | Required if dataSourceType = SqlServer. Name of a SQL table or SQL view. | string | Case-sensitive: Yes |
dataSchema | |||
dataSchemaType | Required. Indicates how your data is formatted. <ul><li>LongTable : A long-form data table has a single column that stores all the variables</li><li>WideTable : A wide-form data table spreads variables across several columns</li></ul> | string | Valid value: "LongTable" |
timestampColumnName | Required. Header of the column that contains datetime values | string | Case-sensitive: Yes |
variableColumnName | Required if dataSchemaType = LongTable. Header of the column that contains the names of your input variables | string | Case-sensitive: Yes |
valueColumnName | Required if dataSchemaType = LongTable. Header of the column that contains numeric values | string | Case-sensitive: Yes |
Q: How do I find my SQL server name?
A: In cases where you are not able to get an answer from your data engineering team, you can follow the steps below to locate your SQL server with a known database name.
If you do not have an Azure SQL database yet, create a single database.
Q: What are the key things I should keep in mind when preparing my data?
A: Here are three key things you should keep in mind:
Q: How to align my data to a single data granularity?
A: Timestamps should be properly aligned for all your variables through aggregation or re-indexing. For example, if your sensors record readings every minute but not always at the exact same second, you should align them to the same minute (see below).
In cases where your sensors' readings come at different frequencies, some users prefer to convert time series with different frequencies to a single common frequency.
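As an illustration of this alignment step, here is a sketch using pandas. The file name is hypothetical, and the column names simply follow the long-table schema used throughout this guide.

```python
# A sketch of aligning per-second sensor readings to one-minute granularity
# with pandas; column names follow the long-table schema used in this guide.
import pandas as pd

df = pd.read_csv("sensor_readings.csv", parse_dates=["eventTimeStamp"])  # hypothetical file

# Floor each reading's timestamp to the minute, then aggregate duplicates
# (here: take the mean) so every variable has one value per minute.
df["eventTimeStamp"] = df["eventTimeStamp"].dt.floor("1min")
aligned = (
    df.groupby(["eventTimeStamp", "sensorName"], as_index=False)["sensorValue"]
      .mean()
)

# To convert variables recorded at different frequencies to a common
# 5-minute granularity, resample each variable's series and aggregate.
per_sensor = (
    aligned.set_index("eventTimeStamp")
           .groupby("sensorName")["sensorValue"]
           .resample("5min").mean()
           .reset_index()
)
```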
Q: What's the difference between long and wide tables?
A:
LongTable
: A long-form data table has a single column that stores all the variables. Data stored in this format will have repeated values in the timestamps column.

WideTable
: A wide-form data table spreads variables across several columns. Data stored in this format will NOT have repeated values in the timestamps column.

Other dataset APIs you may find useful:

[GET] /datasets/{datasetName}
: Get dataset info including data source type, data schema, data granularity, etc.

[GET] /datasets[?skip][&maxpagesize][&sortBy][&orderBy]
: List datasets in a Metrics Advisor resource based on the query parameters specified.

[DELETE] /datasets/{datasetName}
: Delete a dataset in a Metrics Advisor resource. This action doesn't delete the data in the source system.

Train a Metrics Advisor for Equipment model with the training dataset you've added.
For example, if you'd like to train a model with the dataset you just added in step two above, you can call [PUT] https://{endpoint}/multivariate/models/{modelName} to create the model and start the model training process.
Sample Request
{
"parameters": {
"endpoint": "{endpoint}", // Your Metrics Advisor resource endpoint URL
"apiVersion": "2022-07-10-preview",
"Ocp-Apim-Subscription-Key": "{API key}", // Your Metrics Advisor resource key
"Content-Type": "application/json",
"modelName": "controlValve_5min_v1_model", // Unique and case-sensitive
"body": {
"modelDescription": "{Optional field to add more details}",
"datasetName": "prod_controlValve_5min_v1",
"trainingTimeRangeList": [
{
"startTime": "2022-03-17T08:00:00.000Z",
"endTime": "2022-05-01T08:00:00.000Z"
},
{
"startTime": "2022-05-02T08:00:00.000Z",
"endTime": "2022-06-16T08:00:00.000Z"
}
], // You can exclude certain time ranges from training here
"slidingWindow": 300,
"alignPolicy": {
"alignMode": "Outer",
"fillNAMethod": "Customized",
"paddingValue": 0
}
}
}
}
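As the comment in the sample notes, you can exclude time ranges from training. Here is a sketch in Python of building trainingTimeRangeList to skip a known maintenance window (the dates match the sample request above, where May 1 to May 2 is left out) and submitting the training request.

```python
# A sketch of calling [PUT] /multivariate/models/{modelName} with a
# trainingTimeRangeList that excludes a known maintenance window.
import requests

training_time_ranges = [
    {"startTime": "2022-03-17T08:00:00.000Z", "endTime": "2022-05-01T08:00:00.000Z"},
    {"startTime": "2022-05-02T08:00:00.000Z", "endTime": "2022-06-16T08:00:00.000Z"},
]

response = requests.put(
    "https://{endpoint}/multivariate/models/controlValve_5min_v1_model",  # replace {endpoint}
    params={"api-version": "2022-07-10-preview"},
    headers={
        "Ocp-Apim-Subscription-Key": "{API key}",  # your resource key
        "Content-Type": "application/json",
    },
    json={
        "datasetName": "prod_controlValve_5min_v1",
        "trainingTimeRangeList": training_time_ranges,
        "slidingWindow": 300,
        "alignPolicy": {
            "alignMode": "Outer",
            "fillNAMethod": "Customized",
            "paddingValue": 0,
        },
    },
)
response.raise_for_status()  # training runs asynchronously; see the status section below
```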
Response
You will get either a 201 or 200 response if the request was successful.
Request headers
Ocp-Apim-Subscription-Key
: Your Metrics Advisor resource key. This subscription key provides you access to this API.
Content-Type
: Media type of the body sent to the API.
URI parameter
modelName
: Unique identifier of a model. Cannot be changed once a model has been created.
Query parameter
Required
apiVersion
: The current API version is 2022-07-10-preview.
Request body parameters
The request accepts the following data in JSON format.
Parameters | Description | Type | Pattern |
---|---|---|---|
modelDescription | Optional. Provide more information about this model. | string | Character length: [0, 1024] |
datasetName | Required. Data to be used for model training. | string | Case-sensitive: Yes |
trainingTimeRangeList | Required. A list of time ranges used for model training. Both the start and end timestamps are inclusive. | string | yyyy-MM-ddTHH:mm:ss ; conform to ISO 8601 standard |
slidingWindow | Optional. Controls how many previous data points get used to determine if the next data point is an anomaly. | int32 | <ul><li>Value range: [28, 2880]</li><li>Default value: 300</li></ul> |
alignMode | Optional. How to align variables to the same data frequency interval before further processing. Inner mode returns results on timestamps where EVERY variable has a value. Outer mode returns results on timestamps where ANY variable has a value. | string | Default value: Outer |
fillNAMethod | Optional. How to populate any missing values in the dataset. <ul><li>Linear : Fill nan values by linear interpolation.</li><li>Previous : Fill with the last valid value. E.g., [1, 2, nan, 3, nan, 4] -> [1, 2, 2, 3, 3, 4]</li><li>Subsequent : Fill with the next valid value. E.g., [1, 2, nan, 3, nan, 4] -> [1, 2, 3, 3, 4, 4]</li><li>Customized : Fill nan values with a valid value specified in paddingValue</li></ul> | string | Default value: Linear |
paddingValue | Optional. The value to be used when fillNAMethod is Customized. Required if you chose the Customized fillNAMethod; ignored for other methods. | float32 | |
Q: What types of timestamps / time ranges should I exclude for model training?
A: There are two types of data you should exclude from training to reduce noise:
Q: How will slidingWindow be used?
A: slidingWindow will also be used in streaming inference and model evaluation. Let's take a look at these two scenarios separately. Suppose you have set slidingWindow = 1,440, and your input data is at one-minute granularity.

- Scenario 1: You'd like to detect whether one single data point, say the one at 2021-01-03T00:00:00Z, is anomalous. Your startTime and endTime will be the same value ("2021-01-03T00:00:00Z"). Your inference data source, however, must contain at least (1,440 * 2) + 1 timestamps. Given that 1,440 = 60 * 24, your input data must start from 2021-01-01T00:00:00Z at the latest. The required leading data is 2 * slidingWindow, or 2 * 1,440 in this case. We take twice the length of your sliding window to prevent errors due to missing values in your leading data.

- Scenario 2: You'd like to detect anomalies over a time range, i.e., your endTime will be greater than startTime. Inference in such scenarios is performed in a "moving window" manner. For example, the model uses data from 2021-01-01T00:00:00Z to 2021-01-02T23:59:00Z (inclusive) to determine whether data at 2021-01-03T00:00:00Z is an anomaly. Then it moves forward and uses data from 2021-01-01T00:01:00Z to 2021-01-03T00:00:00Z (inclusive) to determine whether data at 2021-01-03T00:01:00Z is an anomaly. It moves on in the same manner (taking 2 * 1,440 data points to compare) until the last timestamp specified by endTime (or the actual last timestamp). In total, your inference data source must contain at least 2 * slidingWindow + (endTime - startTime
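To make the arithmetic concrete, here is a small sketch of the minimum input-data requirements in both scenarios, matching the example above; treat (endTime - startTime) as a count of data-granularity intervals.

```python
# A sketch of the minimum input-data requirements described above,
# assuming one-minute granularity and slidingWindow = 1440 (= 60 * 24).
sliding_window = 1440

# Scenario 1: one single data point (startTime == endTime)
single_point_minimum = 2 * sliding_window + 1          # 2881 timestamps

# Scenario 2: a time range, with (endTime - startTime) measured in
# data-granularity intervals (here: minutes)
end_minus_start = 60                                   # e.g., a one-hour range
range_minimum = 2 * sliding_window + end_minus_start   # 2940 timestamps

print(single_point_minimum, range_minimum)
```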
) data points.

Q: How to determine the slidingWindow value for my model?
A: slidingWindow controls how many previous data points get used to determine if the next data point is an anomaly. For example, if you set slidingWindow = k, then at least k+1 points should be accessible from the source file during inference to get valid detection results. Otherwise, you may get an "InsufficientHistoricalData" error.
Please keep two things in mind when choosing a slidingWindow value:
Q: What's the difference between Inner and Outer alignMode?
A: Inner mode returns results on timestamps where EVERY variable has a value (i.e., the intersection of all variables). Outer mode returns results on timestamps where ANY variable has a value (i.e., the union of all variables).
Here is an example to explain the different alignMode values:
Variable-1
timestamp | value |
---|---|
2020-11-01 | 1 |
2020-11-02 | 2 |
2020-11-04 | 4 |
2020-11-05 | 5 |
Variable-2
timestamp | value |
---|---|
2020-11-01 | 1 |
2020-11-02 | 2 |
2020-11-03 | 3 |
2020-11-04 | 4 |
Inner join (i.e., take the intersection of the two variables)
timestamp | Variable-1 | Variable-2 |
---|---|---|
2020-11-01 | 1 | 1 |
2020-11-02 | 2 | 2 |
2020-11-04 | 4 | 4 |
Outer join (i.e., take the union of the two variables)
timestamp | Variable-1 | Variable-2 |
---|---|---|
2020-11-01 | 1 | 1 |
2020-11-02 | 2 | 2 |
2020-11-03 | nan | 3 |
2020-11-04 | 4 | 4 |
2020-11-05 | 5 | nan |
Q: Is there a limit on the number of models I can create?
A: Yes, for detailed numbers please refer to the Quotas and Limits page for the latest information.

Other model APIs you may find useful:

[GET] /multivariate/models/{modelName}
: Get model info including training dataset, training time range(s), variables being used, training job status, etc.

[GET] /multivariate/models[?skip][&maxpagesize][&sortBy][&orderBy][&status][&datasetNames][&topPerDataset]
: List models in a Metrics Advisor resource.

[DELETE] /multivariate/models/{modelName}
: Delete a model in a Metrics Advisor resource.

Because the create model API is asynchronous, the model will not be ready to use immediately after you call it. Instead, you can query the status of models in two ways:
To check the status and details of a list of models, you can call [GET] /multivariate/models
and apply filters and/or sorting methods of your choice to this list.
For example, if you'd like to list models with status CREATED that were trained on either prod_controlValve_5min_v1 or prod_controlValve_10min_v1 (at most two models per dataset), sorted by model name in descending order and returned one record per page, your request URL will look like:
https://{endpoint}/metricsadvisor/adel/multivariate/models?api-version=2022-07-10-preview&skip=0&maxpagesize=1&sortBy=modelName&orderBy=DESCENDING&status=CREATED&datasetNames=prod_controlValve_5min_v1,prod_controlValve_10min_v1&topPerDataset=2
Here is a sample of the response you would get:
// Sample response
{
"responses": {
"200": {
"body": {
"value": [
{
"modelName": "controlValve_5min_v1_model",
"modelDescription": "Control valve model trained with 3-month data (excl. planned maintenance in May).",
"datasetName": "prod_controlValve_5min_v1",
"trainingTimeRangeList": [
{
"startTime": "2022-03-17T08:00:00.000Z",
"endTime": "2022-05-01T08:00:00.000Z"
},
{
"startTime": "2022-05-02T08:00:00.000Z",
"endTime": "2022-06-16T08:00:00.000Z"
}
],
"slidingWindow": 300,
"alignPolicy": {
"alignMode": "Outer",
"fillNAMethod": "Customized",
"paddingValue": 0
},
"diagnosticsInfo": {
"modelState": {
"epochIds": [],
"trainLosses": [],
"validationLosses": [],
"latenciesInSeconds": []
},
"variableStates": []
},
"status": "CREATED",
"statusUpdatedTime": "2022-07-16T08:21:24.409Z",
"errors": [],
"createdTime": "2022-07-16T08:21:24.409Z"
}
],
// use nextLink to get more results
"nextLink": "https://{endpoint}/metricsadvisor/adel/multivariate/models?api-version=2022-07-10-preview&skip=1&maxpagesize=1&sortBy=modelName&orderBy=DESCENDING&status=CREATED&datasetNames=prod_controlValve_5min_v1,prod_controlValve_10min_v1&topPerDataset=2"
}
}
}
}
To get the status and details of a specific model with its model name, you can use [GET] https://{endpoint}/multivariate/models/{modelName}.
Here is a sample response if you try to get the details for a COMPLETED model:
// Sample response
{
"responses": {
"200": {
"body": {
"modelName": "controlValve_5min_v1_model",
"modelDescription": "Control valve model trained with 3-month data (excl. planned maintenance in May).",
"datasetName": "prod_controlValve_5min_v1",
"trainingTimeRangeList": [
{
"startTime": "2022-03-17T08:00:00.000Z",
"endTime": "2022-05-01T08:00:00.000Z"
},
{
"startTime": "2022-05-02T08:00:00.000Z",
"endTime": "2022-06-16T08:00:00.000Z"
}
],
"slidingWindow": 300,
"alignPolicy": {
"alignMode": "Outer",
"fillNAMethod": "Customized",
"paddingValue": 0
},
"diagnosticsInfo": {
"modelState": {
// Summarizes information about a model training process
"epochIds": [
10,
20,
30,
40,
50,
60,
70,
80,
90,
100
],
"trainLosses": [
0.6291328072547913,
0.1671326905488968,
0.12354248017072678,
0.1025966405868533,
0.0958492755889896,
0.09069952368736267,
0.08686016499996185,
0.0860302299260931,
0.0828735455870684,
0.08235538005828857
],
"validationLosses": [
1.9232804775238037,
1.0645641088485718,
0.6031560301780701,
0.5302737951278687,
0.4698025286197664,
0.4395163357257843,
0.4182931482799006,
0.4057914316654053,
0.4056498706340729,
0.3849248886108984
],
"latenciesInSeconds": [
0.3398594856262207,
0.3659665584564209,
0.37360644340515137,
0.3513407707214355,
0.3370304107666056,
0.31876277923583984,
0.3283309936523475,
0.3503587245941162,
0.30800247192382812,
0.3327946662902832
]
},
"variableStates": [
//Summarizes detailed information about the variables being used for model training
{
"variable": "temperature_delta",
"filledNARatio": 0,
"effectiveCount": 26208,
"firstTimestamp": "2022-03-17T08:00:00.000Z",
"lastTimestamp": "2022-06-16T08:00:00.000Z"
},
{
"variable": "pressure_delta",
"filledNARatio": 0,
"effectiveCount": 26208,
"firstTimestamp": "2022-03-17T08:00:00.000Z",
"lastTimestamp": "2022-06-16T08:00:00.000Z"
},
{
"variable": "travel_minimumValue",
"filledNARatio": 0.9573031135531136, // filledNARatio > 0 indicates that there are missing values for this varaible at the tiem of training
"effectiveCount": 25089,
"firstTimestamp": "2022-03-17T08:00:00.000Z",
"lastTimestamp": "2022-06-15T15:05:00.000Z"
}
]
},
"status": "COMPLETED",
"statusUpdatedTime": "2022-07-16T08:59:56.592Z",
"errors": [],
"createdTime": "2022-07-16T08:21:24.409Z"
}
}
}
}
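Because training runs asynchronously, a common pattern is to poll this endpoint until the model reaches a terminal status. Here is a minimal sketch in Python, assuming the live API returns the body object shown above (without the documentation wrapper); the status values come from the query-parameter documentation below.

```python
# A sketch of polling [GET] /multivariate/models/{modelName} until the
# training job reaches COMPLETED or FAILED.
import time

import requests

def wait_for_model(endpoint: str, api_key: str, model_name: str,
                   poll_seconds: int = 30) -> dict:
    url = f"{endpoint}/multivariate/models/{model_name}"
    while True:
        resp = requests.get(
            url,
            params={"api-version": "2022-07-10-preview"},
            headers={"Ocp-Apim-Subscription-Key": api_key},
        )
        resp.raise_for_status()
        model = resp.json()
        if model["status"] in ("COMPLETED", "FAILED"):
            return model
        time.sleep(poll_seconds)
```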
Request headers
Ocp-Apim-Subscription-Key
: Your Metrics Advisor resource key. This subscription key provides you access to this API.
URI parameter
NOTE: This is only applicable when you try to get a specific model by its name.
modelName
: Unique identifier of a model. Cannot be changed once a model has been created.
Query parameter
Required
apiVersion
: The current API version is 2022-07-10-preview.
Optional
NOTE: These optional parameters are only applicable when you try to get a list of models.
skip
: The number of records to skip from the list of records based on the sorting field and ordering method specified.
maxpagesize
: The maximum number of records to be returned per page. If more records are requested via the API, nextLink will contain the link to the next page.
sortBy
: The name of the field on which you want to sort records. Default value: createdTime
orderBy
: Determines whether the records will be returned in descending or ascending order.
status
: Filter models by one of the model statuses: CREATED, RUNNING, COMPLETED, or FAILED.
datasetNames
: Filter models by a list of training dataset name(s). For each dataset, by default the list of models is ranked by descending model created time (UTC).
topPerDataset
: The total number of models to be returned per dataset, ordered by created time descending.
Response parameters
Besides what you've input when creating the model, here is a list of additional read-only parameters you would see in the API response:
Parameters | Description | Type |
---|---|---|
modelState | ||
epochIds | How many epochs the model has been trained out of a total of 100 epochs. For example, if the returned epochIds is [10, 20, 30, 40, 50], it means that the model has completed its 50th training epoch, i.e., the training process is 50% completed. | int32[] |
latenciesInSeconds | The time cost (in seconds) for every 10 epochs, which can help you estimate the training completion time. In the sample response above, each batch of 10 epochs takes approximately 0.33 seconds. | float32[] |
trainLosses | Indicates how well the model fits the training data. | float32[] |
validationLosses | Indicates how well the model fits the test data. | float32[] |
variableStates | ||
variable | The name of the variable being used for model training. | string |
filledNARatio | Proportion of NaN values filled for the variable. For example, if 1,000 timestamps were used for training and a given variable only has values for 900 timestamps, then the filledNARatio for this variable is 0.1. | float32 |
effectiveCount | Number of valid data points for the variable. For example, if 1,000 timestamps were used for training and a given variable only has values for 900 timestamps, then the effectiveCount for this variable is 900. | int32 |
firstTimestamp | The first timestamp taken from the data source for a given variable. Different variables may have a different firstTimestamp due to missing values. | timestamp |
lastTimestamp | The last timestamp taken from the data source for a given variable. Different variables may have a different lastTimestamp due to missing values. | timestamp |
status | Current status of the model training job. | string |
statusUpdatedTime | The UTC time at which the training status of the model was last updated (if applicable). For instance, the time at which the model status changed from RUNNING to COMPLETED. | timestamp |
errors | Errors during data processing and training. | ErrorResponse property (error codes and messages) |
createdTime | The UTC time at which the model was created. | timestamp |
Q: How to estimate which model is best to use according to training loss and validation loss?
A: Generally speaking, it's hard to decide which model is best without a labeled dataset. However, we can leverage the training and validation losses to get a rough estimate and discard the bad models.
Other model APIs you may find useful:

[PUT] /multivariate/models/{modelName}
: Create and train a model.

[DELETE] /multivariate/models/{modelName}
: Delete a model in a Metrics Advisor resource.

Now that you've successfully trained a model, we highly recommend you test the quality of the trained model on a new set of data and inspect the detected anomalies by creating a model evaluation.
The results of a model evaluation can be visually inspected and further processed by applying filters based on your business needs, which we will cover in more detail in Step 6: Post-process evaluation results and determine the desired alert settings.
Let's first start with creating an evaluation job.
For example, suppose you've trained a model with data from March 2022 to June 2022 and now you'd like to test the model's performance with some new data from the past month (in this case, June 16 to July 15).
To create an evaluation job, you can call [PUT] /multivariate/evaluations/{evaluationName} and kick off the model evaluation process.
Sample Request
{
"parameters": {
"endpoint": "{endpoint}", // Your Metrics Advisor resource endpoint URL
"apiVersion": "2022-07-10-preview",
"Ocp-Apim-Subscription-Key": "{API key}", // Your Metrics Advisor resource key
"Content-Type": "application/json",
"evaluationName": "prod_controlValve_5min_v1_model_evaluation", // Unique and case-sensitive
"body": {
"evaluationDescription": "{Optional field to add more details}",
"modelName": "controlValve_5min_v1_model",
"datasetName": "prod_controlValve_5min_v1", // Ensure this dataset has the same schema as your training dataset
"startTime": "2022-06-16T08:05:00.000Z",
"endTime": "2022-07-15T08:00:00.000Z"
// Both the start and end timestamps are inclusive
}
}
}
Response
You will get either a 201 or 200 response if the request was successful.
Request headers
Ocp-Apim-Subscription-Key
: Your Metrics Advisor resource key. This subscription key provides you access to this API.
Content-Type
: Media type of the body sent to the API.
URI parameter
evaluationName
: Unique identifier of a model evaluation. Cannot be changed once an evaluation has been created.
Query parameter
Required
apiVersion
: The current API version is 2022-07-10-preview.
Request body parameters
The request accepts the following data in JSON format.
Parameters | Description | Type | Pattern |
---|---|---|---|
evaluationDescription | Optional. Provide more information about this model evaluation. | string | Character length: [0, 1024] |
modelName | Required. The trained model to be evaluated. | string | Case-sensitive: Yes |
datasetName | Required. The data being used to evaluate the model. This evaluation dataset should have the same schema and data granularity as the training dataset for the model. | string | Case-sensitive: Yes |
startTime | Required. The first timestamp to be used for model evaluation. Ensure that this timestamp has data in your dataset. | timestamp | yyyy-MM-ddTHH:mm:ss ; conform to ISO 8601 standard |
endTime | Required. The end of the evaluation window; the last timestamp equal to or earlier than endTime will be used for model evaluation. If endTime equals startTime, one single data point will be processed. Please refer to Quotas and Limits for the maximum number of data points that each evaluation job can take. | timestamp | yyyy-MM-ddTHH:mm:ss ; conform to ISO 8601 standard |
Q: Is model evaluation absolutely needed?
A: Though model evaluation is not required, this step is highly recommended for the following reasons:
Q: Is there a limit on the number of evaluations I can create for each model?
A: Yes, for detailed numbers please refer to the Quotas and Limits page for the latest information.
Other evaluation APIs you may find useful:

[GET] /multivariate/evaluations/{evaluationName}
: Get model evaluation info including model evaluated, evaluation dataset, evaluation time range, evaluation job status, etc.

[GET] /multivariate/evaluations[?skip][&maxpagesize][&sortBy][&orderBy][&status][&modelNames][&topPerModel]
: List model evaluations in a Metrics Advisor resource.

[DELETE] /multivariate/evaluations/{evaluationName}
: Delete an evaluation in a Metrics Advisor resource.

After your model evaluation job succeeds, you can call [GET] /multivariate/evaluations/{evaluationName} to retrieve the evaluation result via an Azure Blob URL provided in the API response body.
In this section, we will also go through how to interpret the evaluation results and how to tune them based on your success metrics and business needs.
Sample Response
Let's first look at a sample response from [GET] /multivariate/evaluations/{evaluationName}. Note that you can download the detailed evaluation result through the Azure Blob URL provided in the resultUrl field.
// Sample response for [GET] /multivariate/evaluations/{evaluationName}
{
"responses": {
"200": {
"body": {
"evaluationName": "prod_controlValve_5min_v1_model_evaluation",
"evaluationDescription": "Evaluate controlValve_5min_v1_model performance with 0616-0715 data.",
"modelName": "controlValve_5min_v1_model",
"datasetName": "prod_controlValve_5min_v1",
"startTime": "2022-06-16T08:05:00.000Z",
"endTime": "2022-07-15T08:00:00.000Z",
"status": "COMPLETED",
"statusUpdatedTime": "2022-07-16T14:50:29.324Z",
"errors": [],
"resultUrl": "{azure_blob_url_with_SAS_token}", // Download detailed results with this link
"variableStates": [
// Summarizes information about the variables used in this evaluation
{
"variable": "temperature_delta",
"filledNARatio": 0,
"effectiveCount": 8352,
"firstTimestamp": "2022-06-16T08:05:00.000Z",
"lastTimestamp": "2022-07-15T08:00:00.000Z"
},
{
"variable": "pressure_delta",
"filledNARatio": 0,
"effectiveCount": 8352,
"firstTimestamp": "2022-06-16T08:05:00.000Z",
"lastTimestamp": "2022-07-15T08:00:00.000Z"
},
{
"variable": "travel_minimumValue",
"filledNARatio": 0,
"effectiveCount": 8352,
"firstTimestamp": "2022-06-16T08:05:00.000Z",
"lastTimestamp": "2022-07-15T08:00:00.000Z"
}
],
"createdTime": "2022-07-16T14:26:43.948Z"
}
}
}
}
Evaluation Result Details
The JSON file you downloaded via the resultUrl will look like this:
{
"results": [
{
"timestamp": "2022-07-19T00:27:00.000Z",
"errors": [
{
"code": "InsufficientHistoricalData",
"message": "historical data is not enough."
}
]
},
{
"timestamp": "2022-07-19T00:28:00.000Z",
"value": {
"isAnomaly": false,
"severity": 0,
"score": 0.3928471326828003
},
"errors": []
},
{
"timestamp": "2022-07-19T00:29:00.000Z",
"value": {
"isAnomaly": true,
"severity": 0.5337404608726501,
"score": 0.9171165823936462,
"interpretation": [
{
"variable": "travel_minimumValue",
"contributionScore": 0.6261499548828445,
"correlationChanges": {
"changedVariables": [],
"changedValues": []
}
},
{
"variable": "temperature_delta",
"contributionScore": 0.2585569577470235,
"correlationChanges": {
"changedVariables": [
"pressure_delta"
],
"changedValues": [
0.1229392
]
}
},
{
"variable": "pressure_delta",
"contributionScore": 0.115293087370132,
"correlationChanges": {
"changedVariables": [
"temperature_delta"
],
"changedValues": [
0.1093203
]
}
}
// More contributing variables
]
},
"errors": []
}
// More timestamps
]
}
The results contain a few key concepts:

InsufficientHistoricalData
: This error usually happens only with the first few timestamps, because the model inferences data in a window-based manner and requires enough historical data to make a decision. The minimum amount of historical data required depends on the size of your slidingWindow (see how slidingWindow is used).

isAnomaly
: false indicates the current timestamp is NOT an anomaly. true indicates an anomaly at the current timestamp.

score
: The raw output of the model on which the model makes its decision. Every timestamp has a score, except timestamps with an InsufficientHistoricalData error.

severity
: A value derived from score and normalized into a number between 0 and 1. It indicates the relative severity of a given anomaly. Note that severity is only available if a timestamp has been detected as an anomaly (i.e., isAnomaly = true). For normal timestamps, it is always 0.

interpretation
: This field is only available when isAnomaly = true for a given timestamp. The information listed aims to explain why the model detected a given timestamp as an anomaly and to help you diagnose the root cause. Specifically:

variable
: Name of a top contributing variable to a given anomaly. Contributing variables are ranked in descending contributionScore order.

contributionScore
: Higher contribution scores indicate higher possibility of being the root cause.

correlationChanges
: A list of correlated variable(s) whose correlation with this contributing variable has changed significantly. This field can be empty if there were no significant correlation changes between the contributing variable and other variables. changedVariables contains the list of variable(s) whose correlation to the contributing variable has changed. changedValues contains values that indicate the extent to which the correlation has changed for a given changedVariable.
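Putting these fields together, here is a sketch in Python that reads the downloaded results file and prints each detected anomaly with its severity and top contributing variable. The file name is a placeholder for whatever you saved from resultUrl.

```python
# A sketch of parsing the downloaded evaluation results: list each detected
# anomaly with its severity and the top contributing variable (if any).
import json

with open("evaluation_results.json") as f:  # the file downloaded via resultUrl
    results = json.load(f)["results"]

for record in results:
    value = record.get("value")  # absent for records with errors
    if value and value.get("isAnomaly"):
        interpretation = value.get("interpretation", [])
        top = interpretation[0]["variable"] if interpretation else "n/a"
        print(f'{record["timestamp"]}  severity={value["severity"]:.3f}  '
              f'top contributor: {top}')
```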
First, you should decide the success metrics for the model. Metrics such as precision, true positive rate (i.e., recall), number of false positives per month, and average forewarning time are commonly used by our customers.
Then, with the success metrics in mind, we recommend comparing the evaluation results with label data (timestamps or time ranges where true equipment failures / anomalies happened). At first, there might be too many false alerts identified by the model. If so, try setting different thresholds on severity to filter out less severe anomalies and see whether the remaining anomalies align better with your labels.
For example, you may have 100 timestamps with isAnomaly = true in the beginning, while perhaps only 20 of them are true anomalies. If you observe that more than half of the false positives have a severity lower than 0.3 while most true anomalies score above it, a threshold of 0.3 on severity could help you filter out a large portion of the false alerts. If your stakeholders only want to be notified when something serious happens, you can raise the severity threshold further based on their preference.
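As a sketch of this tuning loop, the snippet below applies a severity threshold to the parsed results from the previous sketch and checks it against a set of labeled anomalies; the label set here is hypothetical and would come from your own failure records.

```python
# A sketch of applying a severity threshold and measuring precision/recall
# against labeled anomalies (the label set below is hypothetical).
labeled_anomalies = {"2022-07-19T00:29:00.000Z"}  # known true failures/anomalies

threshold = 0.3
flagged = {
    r["timestamp"]
    for r in results  # `results` comes from the parsing sketch above
    if r.get("value", {}).get("isAnomaly")
    and r["value"]["severity"] >= threshold
}

true_positives = flagged & labeled_anomalies
precision = len(true_positives) / len(flagged) if flagged else 0.0
recall = len(true_positives) / len(labeled_anomalies)
print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```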
Note: on our web portal and in the alert configuration APIs, sensitivity is the name of the concept that's equivalent to a severity threshold. sensitivity is a number between 0 and 100, and is calculated as (1 - severity) * 100. For example, a severity threshold of 0.3 corresponds to a sensitivity of (1 - 0.3) * 100 = 70. In general, a high sensitivity value means the model is more sensitive to outliers and is likely to identify more anomalies. A low sensitivity value usually means the model will tolerate minor outliers.
Finally, once you've tested a few severity thresholds (or sensitivity values), make a note of this number, as you will need it when setting up the alert configuration for streaming inference.
Other APIs you may find useful:

[PUT] /alertConfigs/{alertConfigName}
: Create an alert configuration.

[GET] /multivariate/evaluations[?skip][&maxpagesize][&sortBy][&orderBy][&status][&modelNames][&topPerModel]
: List model evaluations in a Metrics Advisor resource.

[DELETE] /multivariate/evaluations/{evaluationName}
: Delete an evaluation in a Metrics Advisor resource.

If you feel confident about the model based on the evaluation results, the next step is to configure your alert preferences before setting up a streaming inference schedule to detect anomalies with the model in real time.
To do so, you can:

[PUT] /hooks/{hookName}
: Create a hook to set up the notification channel that will receive your alerts.

[PUT] /alertConfigs/{alertConfigName}
: Create an alert configuration. Set a lower sensitivity to get notified only when severe anomalies are detected; set a higher sensitivity to also be alerted on anomalies that are less severe.

Sample Request
{
"parameters": {
"endpoint": "{endpoint}",
"apiVersion": "2022-07-10-preview",
"Ocp-Apim-Subscription-Key": "{API key}",
"Content-Type": "application/json",
"hookName": "WebhookToDS",
"body": {
"hookType": "Webhook",
"hookDescription": "Webhook for data scientists to receive anomaly alerts for control valves.",
"endpoint": "{https://datasciencecentral.contoso.example/api/AnomalyAlertControlValve}", // Replace with the endpoint to receive alerts
"header": {},
"credential": "https://contoso.key_vault.azure.example/secrets/00000000-0000-0000-0000-000000000000" // Optional if authentication is not needed
}
}
}
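On the receiving side, your hook endpoint just needs to accept an HTTPS POST when an alert is triggered. Here is a minimal sketch using Flask; the alert payload schema is not documented in this guide, so the handler simply logs whatever JSON arrives, and the route path matches the hypothetical endpoint in the sample above.

```python
# A minimal sketch of a webhook receiver for anomaly alerts; the payload
# schema is not shown in this guide, so the handler just logs the body.
from flask import Flask, request

app = Flask(__name__)

@app.route("/api/AnomalyAlertControlValve", methods=["POST"])
def receive_alert():
    alert = request.get_json(silent=True)
    app.logger.info("Received anomaly alert: %s", alert)
    return "", 200

if __name__ == "__main__":
    app.run(port=8080)  # front with an HTTPS-terminating proxy in production
```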
Response
You will get either a 201 or 200 response if the request was successful.
Request headers
Ocp-Apim-Subscription-Key
: Your Metrics Advisor resource key. This subscription key provides you access to this API.
Content-Type
: Media type of the body sent to the API.
URI parameter
hookName
: Unique identifier of a hook. Cannot be changed once a hook has been created.
Query parameter
Required
apiVersion
: The current API version is 2022-07-10-preview.
Request body parameters
The request accepts the following data in JSON format.
Parameters | Description | Type | Pattern |
---|---|---|---|
hookType | Required. The type of the notification channel to receive alert. Only Webhook is supported in the current version. | string | |
hookDescription | Optional. Detailed description of a hook. | string | |
endpoint | Required. The API address to be called when an alert is triggered. MUST be HTTPS. | string | URL |
header | Optional. Custom headers in the API call. A string map include key-value pairs. | key-value pair(s) | For example, a header could be {"Content-Type": "application/json"} |
credential | Optional. Used to authenticate to the endpoint; omit if authentication is not needed. | string | URL |
Sample Request
{
"parameters": {
"endpoint": "{endpoint}",
"apiVersion": "2022-07-10-preview",
"Ocp-Apim-Subscription-Key": "{API key}",
"Content-Type": "application/json",
"alertConfigName": "prod_controlValve_5min_v1_alertconfig",
"body": {
"alertConfigType": "MultiVariateAnomaly",
"alertDescription": "{Add_more_detailed_info_about_this_alert_configuration}",
"hookNames": [
// Optional. Add the hooks that you'd like to receive alert from
"WebhookToSME",
"WebhookToDS"
],
"sensitivity": 70, // Replace with the value that meets your success metrics based on model evaluation results
"correlationWindow": 5,
"suppressCorrelatedAlerts": true
}
}
}
Response
You will get either a 201 or 200 response if the request was successful.
Request headers
Ocp-Apim-Subscription-Key
: Your Metrics Advisor resource key. This subscription key provides you access to this API.
Content-Type
: Media type of the body sent to the API.
URI parameter
alertConfigName
: Unique identifier of an alert configuration. Cannot be changed once an alert configuration has been created.
Query parameter
Required
apiVersion
: The current API version is 2022-07-10-preview.
Request body parameters
The request accepts the following data in JSON format.
Parameters | Description | Type | Pattern |
---|---|---|---|
alertConfigType | Required. The type of alert to be set up. | string | Valid values: "MultiVariateAnomaly" |
alertDescription | Optional. Detailed description of an alert configuration. | string | |
hookNames | Optional. Specifies the list of notification channel(s) through which the alerts will be sent. If left blank, anomalies will still be detected but no alerts will be sent out. This parameter is case-sensitive. | string[] | |
sensitivity | Required. An integer between 1 and 100. Set a lower sensitivity if you only want to be notified when severe anomalies are detected. Set a higher number if you want to report as many anomalies as possible. | int32 | Default value: 70 |
correlationWindow | Required. The number of time-series data points to look back and correlate anomalies. For example, if the window is set to 5, your data comes every 5 minutes, and there is an anomaly at 01:30, then the service will check whether the last anomaly detected was within the past 25 minutes (i.e., window size * data frequency). If so, the service will correlate this new anomaly at 01:30 with the last anomaly and show a correlation ID in the alert notification. By default, the window is set to 0 and each anomaly is considered an individual incident. | int32 | Default value: 0 |
suppressCorrelatedAlerts | Required. True if you only want to receive one alert for each group of correlated anomalies (the alert will be sent for the earliest anomaly detected in this group). False if you want to receive an alert for every anomaly detected (regardless of whether they are correlated). | boolean | Default value: true |
Q: What sensitivity value should I set?
A: sensitivity is the name of the concept that's equivalent to an anomaly severity threshold. sensitivity is calculated as (1 - severity) * 100. In general, a high sensitivity value means the model is more sensitive to outliers and is likely to identify more anomalies. A low sensitivity value usually means the model will tolerate minor outliers.
We highly recommend determining the sensitivity value based on your success metrics and model evaluation results.
Q: How will correlationWindow be used?
A: correlationWindow is generally used together with suppressCorrelatedAlerts to suppress alerts for anomalies that happen at adjacent timestamps, since these anomalies likely share the same root cause. When suppressCorrelatedAlerts is set to false, no alerts will be suppressed regardless of your correlationWindow value. However, if suppressCorrelatedAlerts is set to true, here is how alerts will be triggered when correlationWindow = 5 versus correlationWindow = 2:
Hook

[GET] /hooks/{hookName}
: Get hook info including hook type and associated setup details.

[GET] /hooks[?skip][&maxpagesize][&sortBy][&orderBy]
: List hooks in a Metrics Advisor resource.

[DELETE] /hooks/{hookName}
: Delete a hook in a Metrics Advisor resource.

[PATCH] /hooks/{hookName}
: Update a hook. Updatable properties differ by hook type.

Alert Configuration

[GET] /alertConfigs/{alertConfigName}
: Get alert configuration info including its type and associated settings.

[GET] /alertConfigs[?skip][&maxpagesize][&sortBy][&orderBy]
: List alert configurations in a Metrics Advisor resource.

[DELETE] /alertConfigs/{alertConfigName}
: Delete an alert configuration in a Metrics Advisor resource.

[PATCH] /alertConfigs/{alertConfigName}
: Update an alert configuration. Updatable properties differ by alert configuration type.